-
Notifications
You must be signed in to change notification settings - Fork 47
lndclient: block until chain notifier is ready #255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lndclient: block until chain notifier is ready #255
Conversation
e581dc3 to
be4fd2c
Compare
Include lightninglabs/lndclient#255 TODO: remove replace
Include lightninglabs/lndclient#255 TODO: remove replace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM starius/loop@26ee295!
LND commit lightningnetwork/lnd@c6f458e (v0.20.0-rc3) moved ChainNotifier startup later in the lifecycle
By the way, this commit (lightningnetwork/lnd@c6f458e) in LND v0.20.0-rc3 moved ChainNotifier startup back to its original location from LND v0.18.5. This location was changed in the first release candidate for LND v0.19.0, and this commit reverts it back to the v0.18.5 proper behavior
|
|
||
| // If requested, wait until the chain notifier RPC is ready before we | ||
| // return. This ensures sub-servers relying on the notifier don't fail | ||
| // during startup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Include lightninglabs/lndclient#255 TODO: remove replace
Include lightninglabs/lndclient#255 Include lightninglabs/loop#1039 wait for chain notifier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting approach had some questions.
lnd_services.go
Outdated
| // return. This ensures sub-servers relying on the notifier don't fail | ||
| // during startup. | ||
| if cfg.BlockUntilChainNotifier { | ||
| if !hasBuildTag(services.Version, "chainrpc") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this referring to the tags of lndclient or is this related to the tags sof LND which the client is connected to ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we really need to rely on build tags, will the waitForChainNotifier func not just error when there is no notifier ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this check. chainrpc is a requirement, according to lndclient/README.md
lnd_services.go
Outdated
| timeout, pollInterval time.Duration) error { | ||
|
|
||
| mainCtx := ctx | ||
| if mainCtx == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haven't seen this pattern so far, why don't we trust the caller to provide a non-nil context ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed it. AI glitch
lnd_services.go
Outdated
|
|
||
| for { | ||
| // Make new RegisterBlockEpochNtfn call. | ||
| subCtx, cancel := context.WithTimeout(mainCtx, timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use:
defer cancel() it is way cleaner ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the aim here is to ensure the subCtx is cancelled before we enter a new for loop iteration in the polling case (after case err := <-errChan: is triggered). But since non of the variables returned from the register function call is no longer referenced, I think garbage collection should make that isn't really an issue we need to worry about right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworked it to make a lambda function attempt which makes one attempt. It creates subCtx and cancels it in a defer statement.
lnd_services.go
Outdated
|
|
||
| // Wait for block height notification, which indicates success. | ||
| select { | ||
| case <-mainCtx.Done(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to wait for the subCtx here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think since a timeout of the subCtx won't throw an error that returns true for the isChainNotifierStartingErr, we might as return in the subCtx is done here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! Changed to waiting for subCtx here.
If subCtr expires, it means, we haven't got the first block notification in 30s (default setting of cfg.RPCTimeout). RegisterBlockEpochNtfn must return the current block height immediately. So this is a real error.
lnd_services.go
Outdated
| // Wait for block height notification, which indicates success. | ||
| select { | ||
| case <-mainCtx.Done(): | ||
| cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be removed when we. use defer cancel() above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the call, used defer statement in attempt lambda.
| } | ||
|
|
||
| // chainNotifierStartupMessage matches the error string returned by lnd | ||
| // v0.20.0-rc3+ when a ChainNotifier RPC is invoked before the sub-server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why this was not a problem before because pre 19 it was as it is now ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. This issue was only detected by LiT itest, not in Loop and anywhere else. My guess is that when pre-19 was around, there was no this LiT itest or something was different and it didn't fire.
|
/gemini review |
ViktorT-11
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good 🚀! Thanks for this 🙏! As @ziggie1984 has already commented, I think you can potentially clean up the context handling quite a bit :)
lnd_services.go
Outdated
|
|
||
| // Wait for block height notification, which indicates success. | ||
| select { | ||
| case <-mainCtx.Done(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think since a timeout of the subCtx won't throw an error that returns true for the isChainNotifierStartingErr, we might as return in the subCtx is done here.
lnd_services.go
Outdated
| if isChainNotifierStartingErr(err) { | ||
| select { | ||
| case <-time.After(pollInterval): | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without having tested this myself on the first iteration, I just want to make sure that something is logged here or during the register call. Just so that the user sees that there's still some progress and that the application isn't frozen (similar to how the waitForChainSync function triggers logging of every GetInfo poll attempt.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added logging. Each attempt and its outcome is logged with level INFO.
lnd_services.go
Outdated
|
|
||
| for { | ||
| // Make new RegisterBlockEpochNtfn call. | ||
| subCtx, cancel := context.WithTimeout(mainCtx, timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the aim here is to ensure the subCtx is cancelled before we enter a new for loop iteration in the polling case (after case err := <-errChan: is triggered). But since non of the variables returned from the register function call is no longer referenced, I think garbage collection should make that isn't really an issue we need to worry about right?
be4fd2c to
60c2e6e
Compare
Include lightninglabs/lndclient#255 Include lightninglabs/loop#1039 wait for chain notifier
|
I updated lightninglabs/lightning-terminal#1170 to use the most recent of this PR (60c2e6e) to check if it still fixes the itests. |
60c2e6e to
efe0aa0
Compare
|
Added logging |
ViktorT-11
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot 🔥, uTACK LGTM 🙏!
ziggie1984
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LND commit c6f458e478f9 (v0.20.0-rc3) moved ChainNotifier startup later in the lifecycle, so RegisterBlockEpochNtfn callers now see "chain notifier RPC is still in the process of starting" coming from Recv(). The new BlockUntilChainNotifier config option repeatedly calls RegisterBlockEpochNtfn during startup and only proceeds once the stream yields its first block height, retrying solely when we detect the ErrChainNotifierServerNotActive condition introduced by the LND commit above.
04a63fd to
d4b697c
Compare
Include lightninglabs/lndclient#255 Include lightninglabs/loop#1039 wait for chain notifier
|
Tagged as |
Include lightninglabs/lndclient#255 lndclient: block until chain notifier is ready
LND commit lightningnetwork/lnd@c6f458e (v0.20.0-rc3) moved ChainNotifier startup later in the lifecycle, so RegisterBlockEpochNtfn callers now see "chain notifier RPC is still in the process of starting" coming from Recv().
The new BlockUntilChainNotifier config option repeatedly calls RegisterBlockEpochNtfn during startup and only proceeds once the stream yields its first block height, retrying solely when we detect the ErrChainNotifierServerNotActive condition introduced by the LND commit above.
The wait is guarded by a "chainrpc" build-tag check, so deployments without the ChainNotifier service fail fast when the option is requested.
Pull Request Checklist
in
lnd_services.goare updated.macaroon_recipes.goif your PR adds a new method that is calleddifferently than the RPC method it invokes.